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Part I. Fitting Functions of a Single Argument 
1. Introduction 

The ready availability of calculators and computers has had a profound effect on the use of tables of com- 
plicated functions. For example, in statistical work one may be required in a specific computer program, to 
call on critical values of the F distribution for specified degrees of freedom in the numerator and the denom- 
inator, at specified levels of significance. It is totally impractical to store the entire F table in the memory of 
the computer, but it is entirely feasible to let the computer calculate the required value by a suitable approx- 
imation formula. 

Similar situations occur for physical or chemical properties that are tabulated as functions of temper- 
ature, pressure, wave-length, etc. 

The object of this paper is to present a widely applicable procedure for finding empirical representations 
of tabulated values. The tabulated values are of course assumed to be derived from reasonably smooth func- 
tions of the arguments. The approximation formulas are expected to generate values that are practically 
interchangeable with the corresponding tabulated values. 

We will present the procedure in three parts. Part I is concerned with the empirical fitting of curves, i.e., 
functions of a single argument. Part II deals with functions of two arguments. The case of functions of more 
than two arguments is discussed in Part III. 

Part I consists of two sections. In section 1, we deal with monotonic functions, and in section 2, with func- 
tions that have a single maximum or a single minimum. 

2. Monotonic functions 
2. 1 The general formula 

Polynomials, which are widely used for empirical fitting, have well-known shortcomings for the fitting of 
monotonic functions: they often have undesirable maxima, minima, and inflection points. The formula we 
propose in this section applies to monotonic functions, with or without a single inflection point in the range 
over which the curve is fitted. The formula is 

y, =y„ + A(xt - x a ) \x, - x„\ B ~ u , -°°<x<+ «> (1) 
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where (x h y) are the coordinates of the points to be fitted by a monotone curve; and x ,y o ^i, and B are four 
parameters the values of which have to be estimated. Note that B need not be an integer. 
A simpler formula would be: 



y=y +A{x-x Y 



(2) 



but this formula presents difficulties for x<x , because of the ambiguity of defining (x — x„Y for negative 
values of x — x . Equation (1) is totally free of this shortcoming. We will refer to eq (1) as the "four 
parameter equation" for monotonic functions and denote it by the symbol MFP. 

2.2 Nature of the MFP function 

Table 1 presents the properties of the MFP function in diagrammatic form. The function is defined for 
the entire range x = — °° to x = 4-°°, Note that for B<o, the curve is discontinuous at x and consists 
essentially of two branches, each of which is free of points of inflection. Note also that in this case (B < o), 
the curve has finite asymptotes, equal to y , both at + oo and — oo . 

TABLE 1. MFP: Nature of Function 
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For 5>o, y becomes infinite both at x = — °° and x = +°° and has a point of inflection at x a . 

For B = 1, the curve becomes a straight line and for B = o, it becomes a constant It is worth noting that 
the curve is increasing v/henAB>o and decreasing when^5<o. 

It is apparent that by choosing appropriate portions of the curve, with the proper parameter values, great 
flexibility is available, and it may therefore be expected that the curve will provide good fits for many sets of 
empirical data representing monotone functions with no more than one point of inflection. This does not 
mean, of course, that it will provide satisfactory fits for all sets of monotone data. 

Figures (la) through (le) show some examples of curves that were generated by eq (1). The four 
parameters are given for each case. The figures demonstrate the flexibility that can be achieved through the 
use of this general formula. 

2.3 Method of fiHing 

The procedure we use for finding the four parameters consists of two steps: (a) finding initial values forx 
and B;(h) iterating, using the Gauss-Newton procedure, to improve these estimates. 
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Both steps are presented with a great deal of detail, in spite of the fact that standard procedures can be 
found in the literature for non-linear fitting. The reason for this is the intent to make this paper essentially 
self-contained. 

0ncex o and5 are known, define 

Z .=( x .- Xo )\ x .- Xo \ B-l ( 3 ) 

Then, eq (1) becomes: 

yi=y +-Azi (4) 

which is the equation of a straight line in y, versus z t , for which the intercept y and the slope A are readily 
estimated by linear regression. 

The variable z, can be regarded as a reexpression of x t in a transformed scale. The new scale must be such 
that z, is linearly related toy,. For the purpose of comparing different pairs of (x ,B) in achieving a good fit, 
a convenient measure is therefore the correlation coefficient between z,- and y,. We will use this measure 
throughout the paper with the understanding that it is merely a comparative measure for the adequacy of a 
{x ot B) pair of values, and that we are not concerned here with the statistical properties of this measure. 

2.4 Initial values for x„ and B 

Differentiating y with respect to x, in Eq (1), gives the relation: 

&=y' x =AB\x-x \»-> (5) 

Dividing eq (1) by eq (5) yields: 

y~y» _ x ~ x q 



B 



r=r.4 w "I w (6) 

Now x and y are given for TV points of the curve, andy' x can be approximately calculated for the midpoints of 
the intervals between successive ^-values. The value of y can also be estimated approximately at these mid- 
points. This yields N—\ sets of values x, y, and y x from which xy' x can be calculated. A multiple linear 
regression of y on xy' x and on y' x , allowing for the constant term y e , then gives estimates of the coefficients 

1 x 

y a , -jt and ■—■. We ignore the first and use the two others to estimate x and B. 

D D 



2.5 Illustrative example for finding initial values of x„ and B 

An important statistical application of empirical curve and surface fitting is to represent standard statis- 
tical tables by formulas that can be used for ready interpolation or for incorporation into computer pro- 
grams. Our first example is the two-tail 5 percent critical value of Student's t, for values of v, the degrees of 
freedom, ranging from 2 to «>. The data used for the fit consist of 20 selected pairs of (v, t c ), where t c is the 
critical value in question [2]. The data are given in table 2. We substituted 10,000 for v = °°. 

The calculations for the initial values estimation are shown in table 3. The midpoints of x and y are 
denoted x„ and y m . The derivative y x is approximated by Ay/Ax (column 5). For example: 



TABLE 2. Two-Tail 5 percent Critical Values of Student's 
t Statistic 

Degrees of Freedom (v) Student's t 

2.0 4.3027 

3 3.1825 

4 2.7764 

5 2.5706 

6 2.4469 

7 2.3646 

8 2.3060 

9 2.2622 

10 2.2281 

12 2.1788 

14 2.1448 

16 2.1199 

18 2.1009 

20 2.0860 

25 2.0595 

30 2.0423 

40 2.0211 

60 2.0003 

120 1.9799 

10000 1.9600 

TABLE 3. Calculations for Initial Values ofx and B. 
(Data in Table 2) 
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3.5 


2.97945 


- .406100 


-1.421350 


4. 


2.7764 


4.5 


2.67350 


- .205800 


- .926100 


5. 


2.5706 


5.5 


2.50875 


- .123700 


- .680350 


6. 


2.4469 


6.5 


2.40575 


- .082300 


- .534950 
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- .058600 


- .439500 
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- .372300 


9. 
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- .271150 


12. 


2.1788 


13.0 


2.16180 


- .017000 


- .221000 


14. 


2.1448 


15.0 


2.13235 


- .012450 


- .186750 


16. 


2.1199 


17.0 


2.11040 


- .009500 


- .161500 


18. 


2.1009 


19.0 


2.09345 


- .007450 


- .141550 


20. 


2.0860 


22.5 


2.07275 


- .005300 


- .119250 


25. 


2.0595 


27.5 


2.05090 


- .003440 


- .094600 


30. 


2.0423 


35.0 


2.03170 


- .002120 


- .074200 


40. 


2.0211 


50.0 


2.01070 


- .001040 


- .052000 


60. 


2.0003 


90.0 


1.99010 


- .000340 


- .030600 


120. 


1.9799 


5060.0 


1.96995 


- .000002 


- .010191 


10000. 


1.9600 
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Ax 
The multiple linear regression is carried out by regressing column (4) on a column consisting of unity for 
all rows (for the coefficient y ) and on columns (6) and (5). Thus, the first observational equation is: 

3.7426 = y (1) + jj- (-2.8005) - -|- (-1.1202) 
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The normal equations are (omitting the symmetrical elements below the diagonal in the x'x matrix): 



44.00215000 
-25.67782645 
-6.95277684 



1 _ £. 

B B 



19.0 -8.86179170 -2.15789201 

12.14223318 4.10274947 

1.49195207 



Solving these 3 equations in 3 unknowns, we obtain for the second and third coefficients: 



— = -.93471932 



- ±2. = .75296356 
B 



which give the estimates: 



x = .80555 

B = -1.069839 

Calculating the corresponding z t for all x t by eq (3) and regressing y t on z„ we obtain the estimates 

y = 1.96424556 

A = 2.82544745 

The correlation coefficient between z t andy,, for the pair (x = .80555, B = 1.069839), is 0.999993, indicat- 
ing a quite successful fit. We now develop a Gauss-Newton iteration process to achieve, if possible, an even 
better fit. 

2.6. Gauss-Newton iteration to improve the fit 

The iteration is carried out on the standardized vectors of y and z, defined as follows: J 

Vj s J^i where Sy = V?(;v,-7) 2 (7) 

Oy ' 

t t = -^f^". where 5, = Vf (z,-z ) 2 (8) 

Define: 

D, s v, - r,if E,»A>0 (9) 

D, s - Vi - t t \i^tVit t <Q (10) 

Then the equations yielding the corrections for x and B, denoted by Ax„ and AB, are: 

Di =-^Ax +%,AB (ID 

ox a oB 



'/„ as defined by eq (8) should not be confused with Students's t used for the illustrative example. 



The partial derivatives - a ' and ' are readily evaluated using eqs (3), (7), and (8). The results are: 
ox„ on 

-g. = ±.(r, - p - e.* da 

"Ib" = X w ' " Q~ F " ,) (13) 

where 

P, = -B\ Xi -x \ B - 1 (14) 

Q t = z t tn\x,-x.\ (15) 

P and () are the averages of the P t and the Q t , and 

^ = ZPiti (16) 

^■5^A (17) 

Using these equations, a ' and _„ are computed for all i and a multiple linear regression is carried out, 
ox on 

in accordance with eq (11), of D t on _ ' and ' . The coefficients are Ax and AB. 

ox OB 

To avoid "overshooting," it is advisable to correct x and B by only a fraction of Ax and AB, say (Az„)/4 

and (A£)/4. Thus, the new values for x and 5 are 

new x = x H j*- (18) 

4 

new 5 = 5 + -^- (19) 



Using these new values, z t is recalculated for all i and the entire process is repeated. Iteration continues 
until practical convergence is reached. 

Using the above equations, the entire procedure is readily programmed on the computer. The calcula- 
tions are simple and rapid. 

Referring to our illustrative example (table 2), and starting with the initial values x Q = .80555 and B = 
—1.069839, we obtain after 30 iterations: x = .836464 and B = —1.055240. The correlation between z and y 
is now 0.9999964. The fitted values are correct to within 2 or 3 units in the third place. Further iterations do 
not improve the fit and result in only minute changes in x and B. Table 4 lists the four parameters of the 
final fit and the fitted values. Figure 2 is a graph of the experimental points and of the fitted curve. 

2.7. Additional remarks 

Occasionally, the initial values for x and B, obtained by the method described in section 4 above, are 
unsatisfactory and the iteration process may fail to converge. One possible remedy is to interchange x and y 
in the formula. Indeed, the basic formula y=y +A(x— x ) B can be written: 

x=Xo+-^7w(y-y ) UB (20) 

indicating the same form for expressing* in terms of y as vice-versa. 

If this advice leads to a satisfactory fit, one can do further iterations in the original form (y as a function of 
x), using the values of x„ and B obtained by use of the inverted formula after a few iterations. 



TABLE 4. Fit of Data of Table 2 



Equation: y = 1, 963000 + 2.745822 (v - .836464)'" 

Degrees of Freedom (v) Tabular Value 

2 4.3027 

3 3,1825 

4 2.7764 

5 2.5706 

6 2.4469 

7 2.3646 

8 2.3060 

9 2.2622 

10 2.2281 

12 2.1788 

14 2.1448 

16 2.1199 

18 2.1009 

20 2.0860 

25 2.0595 

30 2.0423 

40 2.0211 

60 2.0003 

120 1.9799 

10,000 (oo) 1.9600 



Fitted Value 



4.3032 
3.1792 
2.7775 
2.5725 
2.4487 
2.3659 
2,3068 
2.2625 
2.2281 
2.1783 
2.1439 
2.1188 
2.0997 
2.0847 
2.0583 
2.0411 
2.0203 
2.0004 
1.9807 
1.9632 



« 3 1 



1 



Two-Tail 5% Critical Values of Student's t -distribution 



50 100 150 200 

Degrees of Freedom 

FlCURE 2. Two-tail 5 percent critical values of Student's t-distribution. 



It is also possible to simply "guess" initial values. It happens occasionally that after an initial tendency to 
diverge (decreasing correlation coefficient), the process suddenly reverses itself and leads to a good fit. 

As noted earlier, the general fitting equation given by (1), yields good results for many sets of monotonic 
data, but cannot be guaranteed to be satisfactory in all cases. 
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3. Non-AAonotonic functions 

3. 1 . The general formula 

We will deal with functions that present a single maximum or a single minimum in the range in which 
they are to be fitted. Polynomials of order 2 (quadratics) appear to be the most plausible candidates for such 
functions. The general equation for the quadratic is: 

y = a + bx + ex 2 (21) 

In practice, however, it is often found that even for functions presenting only a single maximum or a single 
minimum, eq (21) provides a very poor fit indeed. For example, the five points represented by the first two 
columns of table 5, when fitted by a quadratic, give the least squares fit of column 3 of that table. A fairly 
obvious device for improving the fit is to express the x variable in a transformed scale. To be effective, the 
scale transformation must be non-linear. A simple example is given by the relation: 

y = a + bx a + c(x a ) 2 (22) 

We will use a slight modification of eq (22), to allow for a frequently occurring case, namely the case in 
which a logarithmic transformation of x is indicated. 

TABLE 5. A Non-Monotonic Function 

x y Quadratic Fit of y" 

25 .513089 .503630 
10 .551180 .684080 
5 .516202 .329230 
1 .063119 -.104050 
05 -.403100 -.167547 

° Value y = a + bx + ex 2 , resulting from unweighted least squares fit. 
It can readily be shown that 

In x = lim xa ~ l (23) 



Thus, by writing: 



y = a + b ( *" 1 ) +c ( *" l ) 2 ; 0<x< oo (24) 



we obtain a general formula that includes the transformation of a; to its logarithm (for very small a) as well as 
all cases covered by eq (22) (for any other a). 

Equation (24) is the formula we propose for fitting non-monotonic functions of a positive valued argu- 
ment, with a single maximum or minimum. We refer to this equation as the Quadratic Four Parameter 
(QFP) formula. The unsolved problem is to find the value of a that provides the best fit by eq (24). But first 
we study the properties of the QFP function. 

3.2. Nature of the QFP function 

A diagrammatic presentation of the properties of the QFP function is given in Table 6. 
Note that the curve can be monotonic under certain conditions, namely for bal2c> 1. If not monotonic, 
the curve is finite at one end of the range of x and oo at the other end. Within this range, it passes through a 

local minimum or a maximum, which always occurs at x = (1 ^-) ,/a . Here again, as for the MFP, a 

2c 
great deal of flexibility is available. 



Table 6. QFP: Nature of Function. 
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2c 
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2c 





Value of y 


Value of x 


a>0 a<0 


a very close to zero* 


x=0 

X = x„° 

X = +00 


a-JL+c 
a a 

4c 

+ oo( c >0) 
-oo( c <0) 


+ oo( c >0) 
-oo(c<0) 

a -L+* 
a a' 


+ oo( c >0) 
-«(c<0) 

-S 

+ oo(c>0) 
-oo( c <0) 



• *„=(1 - &«) I/-. for a « o, x„ = e- b,u 

* In that case: y = a + blnx+ c(ln x) 2 



3.3. The fitting procedure 

Appropriate o-values are often found in the range —2 to +2. Since a computer program can readily be 
written that fits eq (24) for any given value of a, and since the calculations are quite rapid on any program- 
mable desk calculator, or minicomputer, a trial and error search procedure is a reasonable way to obtain an 
initial value for a. It is then easy to apply a Gauss-Newton iteration process to zero in on the best value for a. 

We propose the following procedures: 



1. Let 



(25) 



Then: 



u = 

a 



y = a + bu + cu 2 



(26) 



2. Choose a small set of values of a between —2 and +2. For each value of a, calculate u for each x, fit eq 
(26), and calculate the correlation coefficient q between y { and the corresponding fitted value y t . Also 
calculate: 



k =T 



(27) 



With a properly-written program, step 2 should take very little time on a programmable desk calculator 
and much less on a minicomputer. 

3. Having found a value of a that gives a g-value of 0.99 or better, in absolute value, 2 this a and the cor- 
responding k may be taken as the starting values for the iteration process, which is presently described: 

4. Equation (26) is written in the form: 



y=a+ b (u + ku 1 ) 



(28) 



1 For data of relatively low precision, it may not be possible to achieve q = .99. We are particularly concerned, in this paper, with data of high precision, 
such as tabulated mathematical functions, or high precision physical or chemical data. 
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5. Let 

z, = u, + kuf (i = 1 to N) (29) 

Then 

ji ■ = a + bzi (30) 

6. Using the same approach as for the monotonic four-parameter (MFP) curve fit, and defining v„ t it and 
Di in the same way(eqs(7) to (10)), we obtain, as before: 

-&- = $*-'-*"* (31) 

■%- ~ ±IQ. - Q - F-* (32) 

However, the quantities P and Q are now defined by the following relations: 

P t = 1 +2ku ' [(1 + autfn x, - u,] (33) 

a 

Qi = a? (34) 

The quantities E and F are defined, as before, by eqs (16) and (17). 
In the place of eq(ll), we now have: 

D, = -| k -Aa +-$£-Ak (35) 

da ok 

The regression of D t on . ' and - ' yields the "corrections" Aa and Ak. Again, it is advisable to use 

only a fraction, say one-fourth, of these quantities at each iteration. 

The iteration process is continued until further impovement of the fit becomes negligible. 

3.4. An illustration 

For illustration of the QFP process, let us return to the example of table 5. Trying first a few values 
between —1 and 1, one finds readily that a high correlation (g =.999263) is obtained for a = —0.1, with a 
corresponding k — .243550. Using these approximations as initial estimates for the iteration process, the 
latter rapidly converges to a q- value of 0.9999988, giving the final estimates (for eq(24)): 

a =-.198499 
a = .062839 
b = .522744 
c =-.140142 

The effectiveness of this fitting procedure can be judged by examining table 7, in which a number of simple 
transformations are compared with the one resulting from our fitting procedure. Note that the logarithmic 
transformation is represented by a=0; what is actually meant is that instead of the transformation to the 
logarithm, eq (23) could have been applied, using a very small value of a, such as 0.001 or 0.0001. It is 
apparent that a dramatic improvement in the fit results from using the value of a that is given by the itera- 
tion process. 

Contrary to what may be believed, the (x,y) data in table 5 were not "made up." They represent one of the 
eigenvectors obtained in the process of fitting a table of a statistical function of three arguments by an 
empirical formula (see part III). The x-values are the values of one of the three arguments: level of signifi- 
cance. 
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TABLE 7. Comparison of Qua dratic Fits 



- 




y fitted, using 


quadratic function of 




X 


y 

observed 


* 


Inx" 




X' 1 


„-0. 198499 


25 


0.513089 


0.503630 


0.495354 




0.560887 


0.513351 


10 


.551180 


.684080 


.578775 




.530048 


.550271 


5 


.516202 


.329230 


.525325 




.478823 


.517023 


1 


.063119 


-.104050 


.014030 




.076830 


.062839 


0.5 


-.403100 


-.167547 


-.372926 




-.406150 


-.402995 




a 


1 







-1 


-.198499 




e 


.89630 


.99675 




.99687 


.999999 



* In x corresponds to a=o. See text. 

3.4. An illustration from the physical sciences and a simple stratagem 

Table 8 is taken from a tabulation of the density d, 3 in grams per milliliter, of ordinary water, for values of 
temperature ranging from —5 to 30 °C [2]. 4 For ease of calculations, we use the coded values y = 10" 6 (d — 
.998). It is well known that this property has a maximum in the vicinity of 4 °C. We consequently try to fit 
the data by the Quadratic Four Parameter (QFP) formula. 

TABLE 8. Density of Ordinary Water as a Function of Temperature 



Temperature 
(°C) 



Density 
(in 7-units ) 



Temperature 


Density 


(°Q 


(in y-units°) 


5 


1992 


6 


1968 


7 


1930 


8 


1877 


9 


1809 


10 


1728 


15 


1129 


20 


234 


25 


-925 


30 


-2322 



-5 

-4 

-3 

-2 

-1 



1 

2 

3 

4 



1283 
1441 
1578 
1694 
1790 
1868 
1927 
1968 
1992 
2000 



•y = 10" (d - 0.998) 

We run into a minor difficulty in that our range of x-values includes negative values. This can be remedied 
in this case by taking as the independent variable x — t + ft, where fi is chosen so as to make all x— values 
positive. We use/? = 10 for our example. Thus: 

x= t + 10 
y= 10 6 (d- 0.998) 

Trial of a few values for a shows that a = 0.8 gives a fit with a correlation coefficient of 0.999967 (whereas 
for a = 1, which corresponds to a "no transformation" quadratic fit, the correlation is 0.99894). The cor- 
responding k value is k = - .055752. Application of the iteration process leads rapidly to the best parameter 
values: 



a = .824103 
a =242.267451 
6=373.749212 
c = 19.881303 
with a correlation q = .999990. 



' To BToid confusion with the corelation coefficient, we use the symbol d, rather than the conventional <j for density. 
4 Figures in brackets refer to literature references located at the end of each of the three parts of this paper. 
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Table 9 presents a comparison of the simple quadratic fit (no transformation) with the QFP. For ease of 
evaluation, the tabulated value is the residual y—y (fitted minus observed). 

The QFP can reliably be used for interpolation. Use of an empirical formula for extrapolation is of course 
always risky and should be done if at all, with great caution. The usefulness of the formula for interpolation 
is demonstrated in table 10, which shows both fits for values of x not included in the fitting process (but 
within the range of the x values used for the fit). 



TABLE 9. Comparison of QFP and QLS Fits for Data of Table 6" 



Temperature 
(°C) 


Residual 


Temperature 
(°C) 


Residual 


QFP 


QLS 


QFP 


QLS 


-5 


-10 


93 


5 


-3 


-35 


-4 


-1 


53 


6 


-3 


-26 


-3 


4 


20 


7 


-4 


-17 


-2 


6 


-4 


8 


-5 


-5 


-1 


6 


-22 


9 


-4 


7 





5 


-35 


10 


-4 


20 


1 


3 


-43 


15 


2 


77 


2 


2 


-45 


20 


8 


97 


3 





-45 


25 


8 


47 


4 


-1 


-41 


30 


-7 


-97 



"QLS = quadratic least squares fit (unweighted) 



TABLE 10. Comparison of QFP and QLS Fits for Interpolation 



Temperature 
(°C) 


Residual 


Temperature 
(°C) 


Residual 


QFP 


QLS 


QFP 


QLS 


11 

12 
13 
14 
16 
17 
18 
19 


-3 
-2 
-1 
1 
4 
5 
6 
7 


32 
45 
57 
68 
86 
92 
96 
97 


21 
22 
23 
24 
26 
27 
28 
29 


9 
9 
9 
8 
5 
3 
1 
-4 


94 

86 

77 

64 

26 

2 

-26 

-60 



4. Conclusion for Part I 

We have presented two formulas for the empirical fitting of functions of a single argument. The first 
applies to monotonic functions; the second can be used for monotonic functions under certain conditions, 
but its main use is for functions that have a single minimum or a single maximum. These formulas turn out 
to be useful also in the fitting process of functions of two or more arguments, as will be shown in Parts II and 
III of this paper. The fitting of the two functions is straightforward and can be readily programmed on com- 
puters and even on programmable calculators. 



5. References 

[1] Symbols, Definitions and Tables for Industrial Statistics and Quality Control, Rochester, Institute of Technology, Rochester, N.Y. 

(1958). 
[2] Handbook of Chemistry and Physics, 56th Edition, CRC Press, Cleveland, Ohio (1975-1976). 
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Part II: Fitting Functions of Two Arguments 
1. Introduction 

Let Zjj be a function of two arguments x t and y 5 . The form of the function is unknown, but a set of data is 
available in the form of a rectangular array, in which the row "labels" are x t and the column "labels" yj. 
The tabulated value corresponding to x t and y-j is z, 7 . We assume that each cell of this two-way table is 
"filled"; i.e.: there are no cells for which the value of z i;/ is missing. 

An example will clarify matters. Table 1 is a portion of the table of 5 percent critical values of the "Stu- 
dentized Range" [1]. This is a relatively short portion of the complete table but will suffice for illustration of 
the method, and will help to show its power. 

The principle of the procedure is to first find the "Singular Value Decomposition" (SVD) [2] * of the 
matrix representing the two-way table, using only the z u values (but not the x-, and the y J), and then to relate 
the parameters of the SVD to the x- t and the yj, using the methods of part I of this paper. 

The SVD [2] is a technique for developing the following relation: 

z u = 6 iu u vij + Qiu 2i v v +... + B p u pi v P j (1) 

or, more compactly: 

p 
Zij = E 6 k u ki v kJ (2) 

where the#* are positive constants and the u ki and v kJ are vectors such that 

E ul, = E vlj = 1, for all k (3a) 

' j 

and 

E u ki u k . t = E v v v k .j = 0, for k * k' (3b) 

It can be shown that p is equal to the rank of the z-matrix, and this rank is, in turn, equal at most to the 
number of rows or the number of columns of the table, whichever is smaller. The 9 k are the square roots of 
the "eigenvalues" of the z T z matrix, where z T denotes the transpose of the matrix z, and the v kj are the 
corresponding "eigenvectors." The u ki are the eigenvectors of the zz T matrix. The 0's are called the 
singular values of the matrix z. 

2. The SVD technique 

Algorithms for finding the SVD are readily available and a number of computer programs have been writ- 
ten for this purpose [14,15,16,17]. 

In most cases, it is not necessary to consider all/) terms on the right hand side of eq (1). In fact, the first 2 
or 3 terms are often sufficient to give an excellent approximation for the z ti . From a practical viewpoint, it is 
easy to judge at what point the SVD can be terminated. 

We illustrate the procedure with the data of table 1. Denote the residuals, after fitting the first q terms on 
the right-hand side of eq(2), by(d ) q . Thus: 

(<i l7 ), = Zij - t E 6 k u ki v kJ (4) 

It can be shown that 



1 The SVD procedure is intimately related to the Method of Principal Components [3]; its use, in either of these two forms for the analysis of two-way 
tables of data has been discussed in a number of places [4,5,6,7,8,9,10,11,12,13]. Figu res ;„ brackets refer to literature references, listed on page 19. 
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ij ij k=l 



(5) 



For the data of table 1, we find 



t = 38.034666; 6 2 = 1.392726; 9 3 = .064557 



and 



Applying eq(5), for q = 3, we find: 



££ zl = 1448.579715. 



EE (d„)U = 0.000044 



This sum of squares of residuals applies to 42 observations; the residuals are not independent Nevertheless, 
we can calculate an "average square residual per observation," 2 which will be equal to 



0.000044 
42 



= 1.05 X 10" 6 = (0.0010) 3 



Thus, after fitting the first three terms of the SVD, the residuals will be of the order of 0.001. Since the data 
are given with 3 decimals, such a fit is, for most practical purposes, quite satisfactory. 
Table 2 lists the 9 values and the eigenvectors u, and Vj for the first three terms. 

TABLE 1. Five Percent Critical Values of the Studentized Range 



\- 


2 


3 


5 


10 


20 


60 


100 


4 


3.927 


5.040 


6.287 


7.826 


9.233 


11.240 


12.090 


8 


3.261 


4.041 


4.886 


5.918 


6.870 


8.248 


8.843 


20 


2.950 


3.578 


4.232 


5.008 


5.714 


6.740 


7.187 


40 


2.858 


3.442 


4.039 


4.735 


5.358 


6.255 


6.645 


120 


2.800 


3.356 


3.917 


4.560 


5.126 


5.929 


6.275 


OO 


2.772 


3.314 


3.858 


4.474 


5.012 


5.764 


6.085 



TABLE 2. 6-Values and Eigenvectors for Studentized Range (table 1) 



V 


"l 


«2 


«3 


n 


*>i 


»* 


"3 


4 


.586984 


-.684908 


-.393566 


2 


.199994 


.550013 


.514706 


8 


.439355 


-.132755 


.588796 


3 


.246064 


.489632 


.178017 


20 


.366344 


.194757 


.446619 


5 


.296980 


.365874 


-.212600 


40 


.343350 


.317316 


.117282 


10 


.357146 


.167917 


-.490644 


120 


.328090 


.407791 


-.239725 


20 


.411696 


-.032030 


-.447995 


00 


.320429 


.456467 


-.477211 


60 


.489830 


-.320941 


.071843 










100 


.523336 


-.437783 


.459993 



0, =38.034666, 0, = 1.392726, 0, = .064557 



3. Fitting the structural parameters 

Our next task is to express the u, and Vj (which we can call the structural parameters) as functions of v and 
n respectively. This is a curve fitting problem, and can be attacked by the methods developed in part I of 
this paper. 



' No attempt is made to compute a "standard deviation of fit" with an appropriate number of degrees of freedom. The fit is purely empirical and is not 
based on a mathematical model; the proposed "average square residual per observation" is to be understood in that spirit 
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The method of section 2 of part I is used for eigenvectors that are monotonic functions of their argu- 
ments, while section 3 is applied to eigenvectors that present a single maximum or minimum. In general, the 
fit for the u 3i and v 3J vectors need not be as good as those for the first two sets of vectors, since the third 
term, having a small multiplier 6 3 , contributes only a small part to the SVD. 

The entire computational procedure is summarized in Table 3 which lists the fitted vectors u ki and v kj 
obtained, for k = 1 and 2, by the MFP procedure and for k = 3, by the QFP fitting procedure. Table 4 lists 
the fitted values, which may be compared to the values in table 1. 



TABLE 3. Parameters of Fits 



MFP Fits 



Vector 


X 


y» 


A 


B 


"i 


.981602 


.320232 


.766066 


-.954958 


"i 


.569248 


1.335141 


-1.167675 


-.079057 


"i 


.523691 


.459599 


-3.326912 


-.856578 


»i 


-5.477500 


-.964295 


3.392462 


-.398982 



QFP Fits 



Vector 


a 


a 


6 


c 


«3 


-.776878 
-.032413 


-24.645495 
1.449792 


47.554916 
-1.543742 


-22.368740 
.307466 



TABLE 4. Fitted Values for Data of Table 1 



\ n 
v \ 


2 


3 


5 


10 


20 


60 


100 


4 


3.923 


5.044 


6.291 


7.823 


9.231 


11.242 


12.089 


8 


3.261 


4.038 


4.889 


5.921 


6.867 


8.247 


8.848 


20 


2.954 


3.572 


4.233 


5.012 


5.712 


6.733 


7.183 


40 


2.862 


3.435 


4.040 


4.739 


5.358 


6.253 


6.648 


120 


2.804 


3.348 


3.917 


4.564 


5.126 


5.930 


6.283 


OO 


2.776 


3.308 


3.859 


4.477 


5.007 


5.755 


6.081 



4. Interpolation 

The total number of parameters used for the fit is 27: four for each of the two vectors in each of the three 
terms of the SVD, in addition to the three 0-values. The actual number is less, since each 6 can be incor- 
porated into the coefficients of one of the two corresponding vectors, and further simplification is possible 
when the entire equation is algebraically reduced. However, this is unimportant for two reasons. 

In the first place, for purposes of programming the calculations, any additional manipulation to reduce 
the number of parameters is unnecessary. 

Secondly, and this is an important point— the often-made assertion that "it is absurd to fit a set of data 
with as may parameters, or almost as many parameters as there are data" can not be justified. The fact is 
that by fitting the 42 values of table 1, we have obtained a formula that fits the 5 percent critical value table 
of the Studentized Range for all values of v from 4 to infinity and all values of nfrom 2 to 100, using only 27 
parameters. That this indeed so, can be verified by applying our fitting algorithm to any pair of v and n 
values (v = 4 to oo, n = 2 to 100). 

The values of Table 5 illustrate this point for some selected pairs of values. 

The procedure we propose is in many cases, a powerful and reliable interpolation algorithm 
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TABLE 5. Examples of Fitted Values for (v,n) Not Included in Fit." 





Studentized Range 


V 


n 


Tabular 


Fitted 


3 


3 


5.910 


5.945 


3 


8 


8.853 


8.863 


3 


50 


13.360 


13.352 


6 


4 


4.896 


4.897 


6 


12 


6.789 


6.789 


6 


70 


9.370 


9.375 


13 


2 


3.055 


3.059 


13 


17 


5.931 


5.929 


13 


90 


7.667 


7.663 


60 


4 


3.737 


3.734 


60 


50 


5.958 


5.956 


60 


80 


6.303 


6.305 


500 


3 


(3.314) 


3.317 


700 


40 


(5.498) 


5.517 


1000 


90 


(6.020) 


6.042 



'Values in parentheses correspond to v values not found in tables. The values in parentheses are for v= oo. 



An Example from the Physical Sciences 

Table 6 lists values of the quantity (n 2 — l)/(n 2 +2) for benzene at various values of pressure and 
wavelength. The values were derived from table 1 of ref. [18]. The symbol n represents refractive index. The 
function (n 2 — l)/(n 2 +2) is chosen, in prefernce to n, because of the Lorentz-Lorenz equation: 



&H=/w 



(6) 



where D represents density and A wavelength. Since for a fixed mass, the density is a function of pressure, 
we can replace D by $(P), where P denotes pressure. Equation (6) can then be written as: 



=±- = <f>(P)-f(\) 



JL 
n 2 +2 



(7) 



According to this equation, the quantity (n 2 — l)/(ra 2 +2) is a multiplicative function of two factors, one 
depending on pressure only, and the other on wavelength only. It can be shown that if this is true, all but the 
first term of SVD of table 6 represent merely experimental error. The first term of the SVD, on the other 
hand represents the quantity 0(P)»/(X). 

TABLE 6. Refractive Index of Benzene at 34.5 °C as a Function of Pressure and Wavelength 









Tabulated Value =- 


n*-l 
n'+2 








Wavelength 


Pressure 


6678 


6438 


5876 


5086 


5016 


4922 


4800 


4678 


1. 


0.287528 


0.288222 


0.290224 


0.294366 


0.294869 


0.295531 


0.296490 


0.297546 


246.2 


.293514 


.294242 


.296296 


.300494 


.301012 


.301707 


.302652 


.303743 


484.8 


.298497 


.299225 


.301278 


.305558 


.306072 


.306737 


.307738 


.308830 


757.2 


.303242 


.303964 


.306057 


.310383 


.310917 


.311597 


.312595 


.313713 


1107.7 


.308426 


.309152 


.311276 


.315688 


.316232 


.316921 


.317930 


.319058 
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Eventhough our procedure makes no pretenses to anything but empirical fitting, this set of data provides 
us with an opportunity to examine the agreement between a physical theory (the Lorentz-Lorenz relation) 
and a set of experimental data. 

The first three 6 values of the SVD of table 6 are: 

0, = 1.92330513; 2 = 0.00009801; 9 3 = 0.00002689 

Note the very large drop from 0i to B z , indicating that one multiplative term in the SVD should represent 
the data quite welL More exactly, we find: 

y y 

= 3.699102657 -(L92330513) 2 = 3.4xl0~ 8 

The average square residual per observation is 



3.4 XlO' 8 _ 



40 



= 8.5 X 10" 10 = (2.9 X 10" 5 ) 3 



Thus, one single multiplicative term reproduces the data of table 6 to about 3 units in the 5th place. It is 
easily verified that addition of a second multiplicative term fails to significantly improve this fit The preci- 
sion of a measurement of n in this study is no better than 1 to 3 units in the fifth place [18]. Applying the law 
of propagation of errors, it is easily seen that the same statement holds for the quantity (n 2 -l)/(n 2 +2). 
We now have the model: 



2 1 

-*4 — ~r- = OuiVj + error 



(8) 



where «, is a function of pressure only, and Vj a function of wavelength only. Thus, eq (8) is equivalent to eq 
(7), as required by the Lorentz-Lorenz theory. 

The fit of Uj as a function of pressure and Vj as a function of wavelength can be accomplished by the four- 
parameter curve. Table 7 lists the parameters of the two curves as well as the fitted values, using eq (8). A 
comparison of tables 6 and 7 confirms the satisfactory quality of the fit 

TABLE 7. Fitted Values for Data of Table 

Equation: n )~* . = 1.923305 [.671861 - 1.483692(P + 1480.809979)" a " 9(M9 ] . [.333889 + .091736 (-^ 1.817095)-' I37982 ] 

Wavelength 



Pressure 


6678 


6438 


5876 


5086 


5016 


4922 


4800 


4678 


1 


0-287552 


0.288247 


0.290238 


0.294364 


0.294840 


0.295519 


0.296470 


0.297513 


246.2 


.293563 


.294272 


.296305 


.300517 


.301003 


.301696 


.302667 


.303732 


484.8 


.298467 


.299188 


.301255 


.305537 


.306032 


.306736 


.307723 


.308806 


757.2 


.303229 


.303961 


.306061 


.310412 


.310915 


.311630 


.312633 


.313733 


1107.7 


.308387 


.309133 


.311268 


.315693 


.316204 


.316931 


.317952 


.319070 



5. Conclusion for Part II 

By combining the Singular Value Decomposition technique with the curve fitting procedures developed 
in part I, it is possible to obtain excellent empirical fits for many sets of data in which the dependent 
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(response) variable is displayed as a two-way table and the rows and columns represent levels of the two in- 
dependent (regressor) variables, respectively. 

The procedure consists in performing an SVD on the matrix of values of the response variable and then 
fitting the vectors of parameters, which are functions of the rows or of the columns, but not of both, to the 
corresponding regressor variables. 
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Part III: Fitting Functions of Three 
or More Arguments 

1. Introduction 

The first two papers in this series (Parts I and II) dealt with ordinary curve and surface fitting, i.e., with 
the fitting of functions of one or two arguments. In the latter case, it was assumed that the data were in the 
form of a two-way table with no cells missing. Similarly, we will assume in this paper, that each value of the 
function to be fitted is associated with a combination of the levels of three or more arguments, all combina- 
tions being present, and each one being associated with a single value of the function. In other words, we 
assume a "complete factorial" with no replications per cell. Of course, if one or more cells contain more 
than a single observation, one can substitute the average for these replicates. For purposes of empirical 
fitting, this should be quite acceptable, provided the precision of the single observations is satisfactory. 

We present the method in terms of a single example, a function of three arguments. Generalization to 
functions of more than three arguments should be self-evident However, the method may become cumber- 
some, and is not recommended as a first choice in these cases. 
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2. Illustration: Fitting the F table 

Table 1 is a portion of the table of critical values of the F distribution for the levels of significance P, of 
25, 10, 5, and 1 percent, and for degrees of freedom, both in the numerator and in the denominator, of 4, 6, 
60, 120, and °°. The table, taken from ref. [1], has 100 "observations," but covers an infinite range of both 
sets of degrees of freedom, Vi and v%. We fully intend the empirical fit to be acceptable over this doubly- 
infinite range, and for all values of P between 1 and 25 percent. 



TABLE I. F-Table. Data for Fit 







P — Level of Significance, 
P 


in percent 




4 


4 


25 
2.06 


10 
4.11 


5 
6.39 


1 
15.98 




6 


1.79 


3.18 


4.53 


9.15 




60 


1.38 


2.04 


2.53 


3.65 




120 


1.37 


1.99 


2.45 


3.48 




00 


1.35 


1.94 


2.37 


3.32 


6 


4 


2.08 


4.01 


6.16 


15.21 




6 


1.78 


3.05 


4.28 


8.47 




60 


1.35 


1.87 


2.25 


3.12 




120 


1.33 


1.82 


2.17 


2.96 




00 


1.31 


1.77 


2.10 


2.80 


60 


4 


2.08 


3.79 


5.69 


13.65 




6 


1.74 


2.76 


3.74 


7.06 




60 


1.19 


1.40 


1.53 


1.84 




120 


1.16 


1.32 


1.43 


1.66 




OO 


1.12 


1.24 


1.32 


1.47 


120 


4 


2.08 


3.78 


5.66 


13.56 




6 


1.74 


2.74 


3.70 


6.97 




60 


1.17 


1.35 


1.47 


1.73 




120 


1.13 


1.26 


1.35 


1.53 




OO 


1.08 


1.17 


1.22 


1.32 


OO 


4 


2.08 


3.76 


5.63 


13.46 




6 


1.74 


2.72 


3.67 


6.88 




60 


1.15 


1.29 


1.39 


1.60 




120 


1.10 


1.19 


1.25 


1.38 




OO 


1.00 


1.00 


1.00 


1.00 



3. Approach 

The method is simply stated. First, we combine two of the three factors, in this case Vi and v 2 , pretending 
it to be, for the time being, a single factor. We then apply the SVD (singular value decomposition) to the 
two-way table thus obtained. Each eigenvector will be a function of either P or of the combination of a 
particular v, and a particular v 2 . This latter type of eigenvector is then entered into a two-way table, as a 
function of v, andv 2 . This two-way table is, itself, subjected to a SVD, with resulting eigenvectors that are 
functions of v, or v 2 , taken singly. The problem is thus reduced to the fitting of a number of curves (func- 
tions of a single argument). 

4. Details of the fitting process 

1. First step: SVD of 25 X 4 table. 

The 25 rows are the combinations of the five levels of v, and the five levels of v 2 ; the four columns repre- 
sent the four levels of the factor P (see table 1). 
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The SVD of this table, carried out to three terms, is represented by 1 

Jit = 6iU u Vu + 02U 2 ,t>2r + d 3 U3iV 3 , 

with the values (see Parts I and II for notation and terminology): 

LLyl = 1939.001 

it J 

0! = 43.70505824 

2 = 5.34641595 

03 = .53288922 



From these values we derive: 



L£(d it )l = 7.495692 X 10" 



which gives an average square residual per observation of: 

7.495692x10 



5x5x4 



= 7.495692 x 10" 6 = (.00274) 2 



(1) 

(2) 
(3) 
(4) 
(5) 



Thus, the fit will be good to approximately 3 units in the third place, provided that all the eigenvectors are 
fitted to an equivalent degree of approximation. 

2. Second step: Fitting the v vectors 

All v vectors are functions of a single variable, P, as shown in table 2. They are readily fitted by the meth- 
ods of Part 1, with the results shown in Table 3. 



TABLE 2. v-Vectors as Functions of P 



25 

10 

5 

1 



0.161663 


0.567164 


0.750537 


.275861 


.537558 


-.155183 


.395108 


.412703 


-.613514 


.861193 


-.468007 


.190293 



Table 3. Fit of v-Vectors 



Vector 



a) MFP Fits 

y = y + A(x — x ) |x-Xo| B ~' 

where x = P; y = v 

7o 



0.104080 
-6.628102 



-0.102867 
.568912 



0.923714 
-9496.645 



-0.388978 
-4.489745 



b) QFP Fit 



y = a + b(- 



■) + c(- 



where * = P; y = v 



Vector 


a 


a 


b 


c 


v» 


-0.237709 


0.190293 


-1.847334 


0.931911 



'" The critical F values are temporarily represented by the symboly,,. 
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3. Third Step: SVD of the u vectors 

Each u vector is a function of v, andv 2 as shown by tables 4, 5, and 0. 

To avoid confusion, we will denote the eigenvectors resulting from the SVD of each u vector by the sym- 
bols Aj and B k for u„ C, and D k for u 2 , and Ej and G k for u 3 . We find that, to obtain sufficient precision, the 
SVD for u u requires three terms, while for u 2 and u 3 two terms suffice; thus: 2 



ui = T ll A lJ B lk + T x2 A v B 2k + T l3 A 3J B 3 
u 2 = T 2X C xj D lk + T 22 C v D 2k 
u 3 = T 3i EijG ik + T 32 E v G 2k 



(6) 
(7) 
(8) 



TABLE 4. u t -Vector as a Function ofv, and v 2 



*x" 


4 


6 


60 


120 


CD 


4 


0.406210 


0.388401 


0.352024 


0.349916 


0.347548 


6 


.247943 


.231426 


.196782 


.194521 


.192350 


60 


.112774 


.098616 


.063326 


.060227 


.056489 


120 


.108349 


.094350 


.058259 


.054485 


.050072 


00 


.104083 


.090175 


.052868 


.048419 


.038755 



TABLE 5. UfVector as a Function of Vi and v-i 



p\'« 


4 


6 


60 


120 


co 


4 


-.273800 


-.232084 


-.153928 


-.149371 


-.144944 


6 


.058346 


.084441 


.132782 


.135561 


.139113 


60 


.227296 


.231801 


.224040 


.221888 


.218938 


120 


.229914 


.232483 


.220851 


.216840 


.212030 


00 


.230595 


.233936 


.216705 


.210834 


.196284 



TABLE 6. Uj-Vector as a Function of v, and Vi 



«*?' 


4 


6 


60 


120 


CO 


4 


0.054105 


0.101229 


0.149334 


0.154646 


0.159300 


6 


-.352902 


-.284130 


-.137823 


-.118086 


-.109860 


60 


-.259816 


-.119461 


.163908 


.180098 


.215083 


120 


-.227942 


-.098102 


.195804 


.216706 


256403 


CO 


-.206583 


-.088254 


.221559 


.247171 


.323018 



4. Fourth Step: Fitting the vectors A, B, C, D, E, and G. 

The vectors 4 C, and E are functions ofv, only, while B, D, and G are functions of v 2 only, as shown in 
table 7. Again, we use the methods of Part I to fit these vectors to their corresponding arguments, with the 
results shown in table 8. 

5. Fifth step: Fit of F as a function of P, v u and v 2 . 

By substituting for u„ u 2 , and u 3 in eq. (1), their expressions as given by eqs (6), (7), and (8), one readily 
obtains an expression for y„ as a function of quantities that are either constants (the 6 and the T), or func- 
tions of a single argument (P, v u and v 2 ). Since the latter have all already been fitted in terms of their 
respective arguments, the problem is solved, except for the routine multiplications and additions involved in 
eqs (1), (6), (7), and (8). A program can readily be written to obtain the value of y„, that is, of F, for any v u v 2 , 
and P, using eqs (1), (6), (7), (8) and the MFP or QFP fits shown in Tables 3 and 8. 



'The square roots of the eigenvalues are represented by the letter T, to avoid confusion with the of eq. (1). 
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TABLE 7. SVD Vectors as Functions of v t or v% 
(Eqs. (6), (7), (8)) 



Vi 


A, 


A, 


A 3 


C 


c 2 


E, 


E 2 


4 


.828136 


-.362993 


-.216246 


-.436484 


-.712491 


.220226 


.407258 


6 


.479297 


.070208 


.361271 


.243432 


-.673184 


.053854 


-.898992 


60 


.179612 


.482627 


.431289 


.506592 


-.166174 


.511488 


-.141436 


120 


.168194 


.520863 


.266325 


.501735 


-.105627 


.554120 


-.025071 


00 


.154640 


.599231 


-.752183 


.491779 


-.020205 


.616385 


.072944 


Vl 


B, 


B 2 


B 3 


D, 


D 2 


G, 


G 2 


4 


.511101 


.609203 


.238929 


.481716 


.687371 


-.473303 


.661606 


6 


.481313 


.367238 


-.143676 


.475017 


.314111 


-.195838 


.558564 


60 


.416240 


-.301559 


-.508432 


.434081 


-.344513 


.426098 


.311762 


120 


.411520 


-.379435 


-.342844 


.426716 


-.377055 


.471649 


.281945 


00 


.405595 


-.509016 


.739069 


.414481 


-.409871 


.577595 


.271280 








Table 8. Fit of 


Vectors of Table 7 









Vector 



a) MFP Fits 

y = y + A(x—x )\x-Xo\"~ 1 
where x = Vi or v 2 , y — vector fitted 



Xo 



A, 
A 2 
B, 
B 2 
D t 
D 2 
G t 
G 2 



2.333242 


.154595 


1.080698 


-0.925484 


3.160494 


.609721 


-.893793 


-.483655 


-.251639 


.405481 


.366518 


-.859647 


-.228726 


-.518182 


2.767959 


-.622996 


-9.389754 


.414477 


.470463 


-.749928 


-.114913 


-.409594 


4.845992 


-1.050171 


.678625 


.581829 


-2.303516 


-.650175 


- 1.463892 


.269022 


2.026138 


-.966873 




b) QFP Fits 








y = a + b(^-) + c(^-y 







where x = Vi\ y = vector fitted 



Vector 



A 3 
B, 
C, 
C 2 

E 2 



-.428299 


-6.868927 


9.408257 


-2.907025 


-.325922 


3.754828 


-4.383540 


1.108433 


-1.196614 


-34.856565 


87.329516 


-53.883595 


-.830262 


3.094441 


-9.025929 


5.346705 


-1.010583 


15.626475 


-37.446675 


22.511319 


-1.126615 


93.240278 


-235.519021 


147.086636 



5. Results 

Table 9 shows the fitted value for each entry of table 1. As expected, the fit is very good. The sum of 
squares of the residuals for 100 values is 0.0109. This gives a root mean square deviation per value of 
0.0104. 

6. Interpolation 

We mentioned earlier that the final fit should be adequate not only in reproducing the values of the ori- 
ginal table, but also as an interpolation formula. In table 10, a comparison is made between values of F as 
given by the Biometrika Tables, and those given by our empirical fit, for combination ofP, v x , and v 2 not 
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included in table 1 (the basis for our formula). Note, in particular, the values for P = 2.5 percent, a level that 
was totally absent from table 1. 

The sum of squares of residuals for these 80 values is 0.01812. Thus, the root mean square deviation 
per value is 0.048. Interpolation would of course be better if a larger table of F values had been used 
for the fitting process. 









Table 9. F Table— i 


7 itted Values 








P 




Vi 


Vl 


25 


10 


5 


1 




4 


4 


2.07 


4.11 


6.40 


15.98 






6 


1.78 


3.18 


4.52 


9.15 






60 


1.39 


2.04 


2.54 


3.66 






120 


1.37 


1.98 


2.45 


3.47 






00 


1.36 


1.95 


2.39 


3.32 




6 


4 


2.07 


4.00 


6.16 


15.21 






6 


1.79 


3.07 


4.29 


8.47 






60 


1.35 


1.87 


2.26 


3.12 






120 


1.32 


1.80 


2.16 


2.95 






00 


1.31 


1.76 


2.10 


2.82 




60 


4 


2.07 


3.78 


5.69 


13.66 






6 


1.75 


2.77 


3.74 


7.06 






60 


1.19 


1.40 


1.54 


1.84 






120 


1.15 


1.32 


1.42 


1.65 






00 


1.11 


1.23 


1.30 


1.47 




120 


4 


2.08 


3.78 


5.66 


13.56 






6 


1.74 


2.74 


3.70 


6.96 






60 


1.17 


1.35 


1.47 


1.73 






120 


1.13 


1.26 


1.35 


1.53 






00 


1.08 


1.16 


1.21 


1.31 




00 


4 


2.09 


3.77 


5.64 


13.46 






6 


1.73 


2.72 


3.66 


6.89 






60 


1.15 


1.29 


1.38 


1.60 






120 


1.10 


1.18 


1.23 


1.36 






00 


1.01 


1.02 


1.00 


.95 











Table 10. F Table— 


Interpolated Values" 












fl%) 


P 




25 


10 


5 


2.5 


1 


*i 


Vl 


Tab. 


Fit 


Tab. Fit 


Tab. Fit 


Tab. 


Fit 


Tab. 


Fit 


5 


5 


1.89 


1.87 


3.45 3.41 


5.05 4.99 


7.15 


7.06 


10.97 


10.93 




15 


1.49 


1.53 


2.27 2.33 


2.90 2.98 


3.58 


3.59 


4.56 


4.61 




30 


1.41 


1.42 


2.05 2.07 


2.53 2.57 


3.03 


3.00 


3.70 


3.73 




40 


1.39 


1.39 


2.00 2.00 


2.45 2.47 


2.90 


2.87 


3.51 


3.53 


15 


5 


1.89 


1.86 


3.24 3.20 


4.62 4.57 


6.43 


6.34 


9.72 


9.69 




15 


1.43 


1.48 


1.97 2.06 


2.40 2.51 


2.86 


2.89 


3.52 


3.57 




30 


1.32 


1.34 


1.72 1.76 


2.01 2.06 


2.31 


2.28 


2.70 


2.72 




40 


1.30 


1.31 


1.66 1.68 


1.92 1.95 


2.18 


2.14 


2.52 


2.53 


30 


5 


1.88 


1.85 


3.17 3.14 


4.50 4.44 


6.23 


6.13 


9.38 


9.34 




15 


1.40 


1.45 


1.87 1.96 


2.25 2.35 


2.64 


2.67 


3.21 


3.26 




30 


1.28 


1.30 


1.61 1.64 


1.84 1.89 


2.07 


2.05 


2.39 


2.40 




40 


1.25 


1.27 


1.54 1.56 


1.74 1.77 


1.94 


1.90 


2.20 


2.21 


40 


5 


1.88 


1.85 


3.16 3.12 


4.46 4.40 


6.18 


6.07 


9.29 


9.25 




IS 


1.39 


1.43 


1.85 1.93 


2.20 2.30 


2.59 


2.60 


3.13 


3.17 




30 


1.27 


1.29 


1.57 1.61 


1.79 1.84 


2.01 


1.98 


2.30 


2.32 




40 


1 .24 


1 25 


1.51 1.53 


1.69 1.72 


1.88 


1.83 


2.11 


2.12 



' Tab. = tabulated, from [1] 
Fit = fitted by the procedure of this paper, 
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The total number of parameters is the sum of 10 (one for each 9 or 7), and 4x17 (four for each of the 
three eigenvectors v, and four for each of the 14 vectors occuring in eqs (5), (6), and (7)); i.e., 78. As men- 
tioned in Part II, this number can be somewhat reduced through algebraic manipulation, but this is unnec- 
essary for a fitting process carried out on a programmable calculator or on a computer. 

We finally repeat our previous assertion (see also Part II) that these 78 parameters fit not only a table of 
100 observations (Table 1), (which would be a waste of time) but actually any F value, for P between 1 and 25 
percent, and for Vi and v 2 between 4 and oo. 

7. Conclusion for Part III 

Through repeated application of the procedure given in Part II, it is possible to fit functions of more than 
two arguments, provided the data appear as a complete factorial. This is accomplished by first combining all 
combinations of two or more factors into one factor until a two-way table is obtained. The parameter vectors 
of the SVD of this table are then expressed as two-way tables themselves and further SVD's are carried out 
The procedure is simple in principle but can become quite cumbersome in practice. It is not recommended 
for functions of more than three arguments, unless no other appropriate fitting procedure is available. 
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